Modeling Others using Oneself in Multi-Agent Reinforcement Learning
Abstract
We consider the multi-agent reinforcement learning setting with imperfect information in which each agent is trying to maximize its own utility. The reward function depends on the hidden state (or goal) of both agents, so the agents must infer the other players’ hidden goals from their observed behavior in order to solve the tasks. We propose a new approach for learning in these domains: Self Other-Modeling (SOM), in which an agent uses its own policy to predict the other agent’s actions and update its belief of their hidden state in an online manner. We evaluate this approach on three different tasks and show that the agents are able to learn better policies using their estimate of the other players’ hidden states, in both cooperative and adversarial settings.
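The core idea in the abstract, reusing one's own policy to explain the other agent's behavior and revising a belief about its hidden goal after each observed action, can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not specify the update rule, so the sketch assumes a simple Bayesian reweighting over a discrete set of candidate goals. The 1-D world, the `own_policy` function, its `temperature` parameter, and `update_belief` are all hypothetical names introduced for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def own_policy(position, goal, temperature=0.5):
    """Hypothetical policy on a 1-D line (an assumption, not from the paper).

    Actions are -1, 0, +1; moves that end closer to `goal` get higher
    probability. Returns a dict mapping action -> probability.
    """
    actions = (-1, 0, 1)
    scores = [-abs((position + a) - goal) / temperature for a in actions]
    return dict(zip(actions, softmax(scores)))


def update_belief(belief, other_position, observed_action):
    """Reweight each candidate goal by how likely the agent's OWN policy
    would have been to take `observed_action` from the other agent's
    position if it were pursuing that goal, then renormalize.
    """
    posterior = {
        goal: prior * own_policy(other_position, goal)[observed_action]
        for goal, prior in belief.items()
    }
    total = sum(posterior.values())
    return {goal: weight / total for goal, weight in posterior.items()}


# Start with a uniform belief over two candidate hidden goals.
belief = {0: 0.5, 10: 0.5}
position = 5

# The other agent is observed stepping right three times in a row.
for action in (1, 1, 1):
    belief = update_belief(belief, position, action)
    position += action

print(belief)
```

After three rightward steps the belief concentrates on the goal at position 10, which the observing agent could then feed into its own decision making. A discrete belief is used here only because it keeps the example short; a learned policy network with a continuous goal estimate would need a different update.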
Similar resources
Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Multi-agent Markov decision processes (MMDPs), as the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...
Voltage Coordination of FACTS Devices in Power Systems Using RL-Based Multi-Agent Systems
This paper describes how multi-agent system technology can be used as the underpinning platform for voltage control in power systems. In this study, some FACTS (flexible AC transmission systems) devices are properly designed to coordinate their decisions and actions in order to provide a coordinated secondary voltage control mechanism based on multi-agent theory. Each device here is modeled as ...
Simulation of Self-Control through Precommitment Behaviour in an Evolutionary System
The purpose of this thesis is to determine how evolution has resulted in self-control through precommitment behaviour. Empirical data in psychology suggest that we recognize we have self-control problems and attempt to overcome them by exercising precommitment, which biases our future choices toward a larger, later reward. The behavioral model of self-control as an internal process is taken from psyc...
Applications of Game theory in multi-agent reinforcement learning
Multi-agent systems are a fast-growing paradigm for problem solving, and their applications are growing every day. Adaptivity is one of the key features of a multi-agent system, which involves learning. Unfortunately, due to the extreme complexity of the environment in which the agents interact and the effect of each one's actions on the others, multi-agent learning is still an open problem. In this pap...
Multi-Agent Evolutionary Game Dynamics and Reinforcement Learning Applied to Online Optimization of Traffic Policy
This chapter demonstrates an application of agent-based selection dynamics to the traffic assignment problem. We introduce an evolutionary dynamic approach that acquires payoff data from multi-agent reinforcement learning to enable an adaptive optimization of traffic assignment, provided that classical theories of traffic user equilibrium pose the problem as one of global optimization. We then s...
Journal: CoRR
Volume: abs/1802.09640
Publication date: 2018